Download English ISI Article No. 37574

English Title
Facial expression megamix: Tests of dimensional and category accounts of emotion recognition
Article code: 37574 | Year of publication: 1997 | Length: 43 pages (PDF)
Source

Publisher: Elsevier - Science Direct

Journal: Cognition, Volume 63, Issue 3, 3 July 1997, Pages 271–313

Keywords
Facial expression; emotion recognition

Abstract

We report four experiments investigating the perception of photographic-quality continua of interpolated ('morphed') facial expressions derived from prototypes of the 6 emotions in the Ekman and Friesen (1976) series (happiness, surprise, fear, sadness, disgust and anger). In Experiment 1, morphed images made from all possible pairwise combinations of expressions were presented in random order; subjects identified these as belonging to distinct expression categories corresponding to the prototypes at each end of the relevant continuum. This result was replicated in Experiment 2, which also included morphs made from a prototype with a neutral expression, and allowed 'neutral' as a response category. These findings are inconsistent with the view that facial expressions are recognised by locating them along two underlying dimensions, since such a view predicts that at least some transitions between categories should involve neutral regions or identification as a different emotion. Instead, they suggest that facial expressions of basic emotions are recognised by their fit to discrete categories. Experiment 3 used continua involving 6 emotions to demonstrate best discrimination of pairs of stimuli falling across category boundaries; this provides further evidence of categorical perception of facial expressions of emotion. However, in both Experiment 1 and Experiment 2, reaction time data showed that increasing distance from the prototype had a definite cost on the ability to identify emotion in the resulting morphed face. Moreover, Experiment 4 showed that subjects had some insight into which emotions had been blended to create specific morphed images. Hence, categorical perception effects were found even though subjects were sensitive to the physical properties of these morphed facial expressions. We suggest that rapid classification of prototypes and better across-boundary discriminability reflect the underlying organisation of human categorisation abilities.

Introduction

People are very skilled at understanding each other's facial expressions. We know that babies are very interested in faces (Johnson et al., 1991), and that they show precocious ability to respond to different facial expressions (Field et al., 1982). We also know that, for tests using a fixed range of alternative choices, certain configurations of facial features resulting from specific patterns of facial muscle movements are recognised throughout the world as corresponding to particular basic emotions (Ekman, 1992; Ekman, 1994). Moreover, selective changes in the ability to recognise emotion from the face have been reported after brain injury; sometimes, patients may remain able to recognise other social cues such as identity from the face, even though they have problems in recognising facial emotion (Calder et al., 1996b; Etcoff, 1984; Sprengelmeyer et al., 1996; Young et al., 1993). The physiological literature also suggests differences in the neural coding of facial identity and expression in other primates (Desimone, 1991; Hasselmo et al., 1989), and PET studies of humans have shown differences between brain regions involved in the analysis of identity and expression (Sergent et al., 1994), and demonstrated the possible existence of emotion-specific responses to facial expressions (Morris et al., 1996). These facts are consistent with the long evolutionary history of facial expressions of emotion (Darwin, 1872; Ekman, 1973), but we know little about the perceptual basis of how emotions are recognised.

One of the fundamental issues that is still disputed concerns whether facial expressions are perceived as varying continuously along certain underlying dimensions, or as belonging to qualitatively discrete categories (Ekman, 1982; Ekman et al., 1972). This issue has been difficult to resolve because many data can be accommodated within either view, and hybrid models are sometimes proposed. For example, Woodworth and Schlosberg (1954) identified happiness (in which they included love and mirth), surprise, fear (including suffering), anger (and determination), disgust and contempt as distinct, recognisable categories of emotion, but then suggested on the basis of their confusabilities that they can be considered to be located around the circumference of a circle (running happiness–surprise–fear–anger–disgust–contempt–happiness) with two orthogonal diagonals corresponding to the dimensions pleasant–unpleasant (running from the region of happiness to the region of anger) and attention–rejection (running from the surprise and fear boundary to the boundary between disgust and contempt). This suggestion is shown in Fig. 1a. A modern variant of this idea is the Russell (1980) circumplex model, in which more extreme degrees of an emotion fall around the edge of a two-dimensional emotion space encoding orthogonal bipolar dimensions of pleasure and arousal, with milder emotions falling more toward the centre.

Fig. 1. Schematic representations of the relation between different facial expressions. (a) As proposed by Woodworth and Schlosberg (1954). (b) Continua used in Experiment 1.

The intention of such models is to create a two-dimensional solution to the problem of perceptually classifying facial expressions. Woodworth and Schlosberg (1954) drew an analogy to the colour circle, in which hue is arranged around the circumference and saturation along red–green and blue–yellow opponent axes.
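The geometric prediction that follows from such a circular layout can be made concrete with a toy sketch. This is our own illustration, not anything from the paper; the coordinates are arbitrary choices that merely respect the circular ordering in Fig. 1a. Linear transitions between opposite emotions pass through the origin, which a dimensional account treats as neutral or indeterminate, while transitions between neighbours stay near the rim (a point developed in the next paragraphs).

```python
import numpy as np

# Toy coordinates (ours) for the Woodworth-Schlosberg circle of Fig. 1a:
# emotions at angles on a unit circle, happiness-anger forming the
# pleasant-unpleasant diagonal.
ANGLES_DEG = {'happiness': 90, 'surprise': 30, 'fear': -30,
              'anger': -90, 'disgust': -150, 'contempt': 150}

def position(emotion):
    """2-D coordinates of an emotion on the unit circle."""
    theta = np.radians(ANGLES_DEG[emotion])
    return np.array([np.cos(theta), np.sin(theta)])

def midpoint_distance(e1, e2):
    """Distance from the origin at the midpoint of a straight
    transition between two emotions in the 2-D emotion space."""
    return float(np.linalg.norm((position(e1) + position(e2)) / 2.0))

print(midpoint_distance('happiness', 'anger'))     # 0.0: crosses the neutral origin
print(midpoint_distance('happiness', 'surprise'))  # ~0.87: stays near the rim
```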
In the present study, we took advantage of image-manipulation techniques to examine contrasting predictions which follow from accounts in which facial expressions of emotion are perceived and recognised as discrete categories or by locating them in terms of a small number of underlying dimensions. A consequence of dimensional accounts is that linear (or near-linear) physical transitions from one emotional expression to another will be accompanied by characteristic changes in identification. For example, in Woodworth and Schlosberg's (1954) schema (Fig. 1a), a transition from a happy face to an angry face cannot be effected without passing through a region close to the origin of the 'emotion space', where the expression should become neutral or indeterminate. This must happen in the Woodworth and Schlosberg (1954) account because happiness–anger corresponds to one of the hypothesised dimensions of the perceptual emotion space (pleasant–unpleasant), but the same will be true of any transition between expressions lying at opposite points in the emotion space. Similarly, Fig. 1a shows that a transition from a happy to a frightened expression will not cross the origin of the two-dimensional emotion space, and thus need not involve any region where the expression becomes indeterminate, but it will enter a region where the expression should be seen as one of moderate surprise. In contrast, a transition from happiness to surprise does not involve entering the region of any other emotion, and may, therefore, be relatively abrupt, passing directly from one emotion to the other. Dimensional accounts, then, can be used to predict the consequences for identification of physically transforming one facial expression into another.

A method for making such transformations is available from image-manipulation procedures, for which algorithms have been devised to allow manipulation of photographic-quality images of faces (Benson and Perrett, 1991; Burt and Perrett, 1995). This is achieved by specifying the locations of a large number of facial feature points on prototype images. To create interpolated ('morphed') images depicting the continuum between two faces (say, pictures of a person with happy and sad expressions) in photographic quality, the positions of the features in one photograph are moved toward their positions in the other photograph, as if the image lies on a rubber sheet. The technique is described in detail later, and our figures show that highly regular changes can be achieved.

It is possible, then, to change one facial expression into another by computer morphing, and from the Woodworth and Schlosberg (1954) schema we can predict the consequences for identification if this dimensional account is correct. These predictions contrast with those of an account of the perceptual identification of facial expressions as discrete categories. From the discrete category standpoint, Fig. 1a is not an accurate geometric representation, since the perceptual space will be multi-dimensional. A transition from any one category to any other category may, therefore, never involve passing through any region which will not be assigned to one end-point category or the other. From this perspective, changes in identification will always be relatively abrupt, with no need for any region in any continuum between two prototype expressions to correspond to a third emotion.
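The core of this feature-point interpolation can be sketched in a few lines. This is a minimal illustration under our own naming, not the Benson and Perrett (1991) algorithm itself: a full photographic-quality morph also warps the image texture to the interpolated feature positions and cross-dissolves pixel intensities between the two warped images.

```python
import numpy as np

def interpolate_landmarks(landmarks_a, landmarks_b, alpha):
    """Linearly interpolate corresponding facial feature points.

    landmarks_a, landmarks_b: (N, 2) arrays giving (x, y) positions of
    the same N feature points marked on two prototype photographs.
    alpha: proportion contributed by prototype B (0.0 = pure A, 1.0 = pure B).
    """
    a = np.asarray(landmarks_a, dtype=float)
    b = np.asarray(landmarks_b, dtype=float)
    return (1.0 - alpha) * a + alpha * b

# Example: a 90% happy / 10% angry morph along the happiness-anger
# continuum corresponds to alpha = 0.1 with A as the happy prototype:
# morph_points = interpolate_landmarks(happy_points, angry_points, 0.1)
```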
Of course, predictions derived from Woodworth and Schlosberg's (1954) schema can only work fully to the extent that it is an accurate geometric representation, based on the correct underlying dimensions. This is important because we noted that other two-dimensional accounts, such as Russell's (1980) circumplex model, do use different dimensions and hence postulate slightly different positioning of some emotions. There can thus be differences in the precise predictions made by different variants of two-dimensional accounts. However, what is important is that all two-dimensional accounts of emotion classification must share the property that they will predict that transitions between facial expressions of emotion will be variable in their effects, according to how these particular expressions align in the emotion space, and that at least some transitions between two expressions will involve regions of indeterminacy or a third emotion. Regardless of the exact predictions as to when these types of transition will arise, the dimensional accounts thus contrast with category-based models, for which this property is not a necessary feature; in a category-based account, the transition between any two emotions can always be perceptually abrupt.

In previous studies of identification of morphed images of facial expressions by normal perceivers (Calder et al., 1996a; Etcoff and Magee, 1992), only a limited number of continua have been tested at any given time. In a typical paradigm, subjects may be given a face varying along the continuum from expression 'x' to expression 'y', and asked whether its expression looks more like x or y. Sharp changes in identification along such continua have been found, but this Procrustean method may artificially exaggerate the abruptness of any transition, and hide the kinds of indeterminacy or outright shift to a different category predicted by dimensional accounts, since only the categories x or y are permitted as choices. In Experiments 1 and 2, then, we tested the contrasting predictions of dimensional and category-based accounts by morphing between all possible pairs of facial expressions of basic emotions from the Ekman and Friesen (1976) series, and examining recognition of stimuli from all continua in a common set of trials. Our results were inconsistent with two-dimensional accounts, and favoured a category-based model.

A second type of prediction can be derived from the view that facial expressions of emotion are recognised as belonging to distinct perceptual categories. This prediction can also be seen using Woodworth and Schlosberg's (1954) analogy to colour perception. One of the striking features of colour perception is that, although wavelength varies continuously, the appearance of the circumference of the colour circle to a person with normal colour vision involves discrete regions of relatively consistent colour, with comparatively abrupt transitions between them. Psychophysical studies have confirmed that we are relatively insensitive to changes in wavelength if they occur within a region belonging to a colour category, and more sensitive to changes of the same physical magnitude occurring across the boundary between two colours (Bornstein and Korda, 1984). Many other examples of such phenomena have been described for perceptual categories that vary on a single underlying physical dimension (Harnad, 1987a). Collectively, they are known as categorical perception.
The phenomenon widely regarded as the hallmark of categorical perception is that linear physical changes in a stimulus can have non-linear perceptual effects, with changes which occur near to or across category boundaries being easier to detect. Morphed images have also made it possible to explore categorical perception effects with multidimensional stimuli, such as faces (Beale and Keil, 1995; Calder et al., 1996a; Etcoff and Magee, 1992). The pioneering study was by Etcoff and Magee (1992), who took advantage of these technical developments to investigate categorical perception of facial expressions; they converted photographs from the Ekman and Friesen (1976) series of pictures of facial affect into line drawings, and used a computer program to create several series of drawings representing equal interpolated steps between two different facial expressions posed by the same individual. With these line drawing stimuli, Etcoff and Magee (1992) measured identification of the emotion seen in individual stimuli falling along a particular expression continuum (e.g., from happiness to sadness), and discrimination between pairs of these stimuli with an ABX task (in which stimuli A, B and X were presented sequentially; subjects had to decide whether X was the same as A or B). For identification, Etcoff and Magee (1992) observed sharp boundaries between a region of each continuum perceived as corresponding to one expression, and a region corresponding to the other expression. In the ABX discrimination task, Etcoff and Magee (1992) found that people were more accurate at detecting the differences between pairs of drawings which crossed a subjective category boundary (such as between a drawing seen as happy in the identification task and a drawing seen as sad) than they were at detecting equal physical differences which lay within a category (i.e., between two drawings which would be identified as happy, or two drawings identified as sad). This was clear evidence of categorical perception of facial expressions.

Calder et al. (1996a) used photographic-quality morphed images of expression continua to replicate Etcoff and Magee's (1992) findings of categorical perception of facial expressions. Calder et al. (1996a) also found sharp boundaries between the region of each continuum perceived as corresponding to one expression, and the region corresponding to the other expression, and they showed that subjects' discrimination performance could be predicted from their identification performance with the same stimuli; discrimination was poorest for stimulus pairs in which identification of the emotion was most consistent, and discrimination was best when identification was least consistent.

Findings of categorical perception of facial expressions again challenge the view that the perception of facial expressions is determined by a small number of underlying dimensions, such as pleasant–unpleasant. On a dimensional account, the perceptual function should be continuous from one emotion to the next; the categorical perception results show that it is not. In Experiment 3, we further examined this phenomenon, introducing a much more severe test of categorical perception than has been used in previous publications, by examining within-category and between-category discriminabilities for stimuli drawn from the six possible continua forming the perimeter of the hexagon shown in Fig. 1b.
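How such a discrimination analysis is scored can be sketched briefly. The sketch below is our own illustration, not the authors' analysis code; the trial format, function names and data shapes are assumptions. Each stimulus pair is classed as within- or between-category using the modal label it received in the identification task, and categorical perception predicts higher accuracy for between-category pairs of equal physical separation.

```python
from collections import defaultdict

def abx_accuracy_by_pair_type(trials, modal_label):
    """Split ABX discrimination accuracy into within- vs between-category pairs.

    trials: iterable of (step_a, step_b, correct) tuples, where step_a and
    step_b index morph steps along a continuum and `correct` records whether
    the subject matched X to the right stimulus.
    modal_label: dict mapping each morph step to the emotion it was most
    often given in the identification task.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for step_a, step_b, correct in trials:
        kind = 'within' if modal_label[step_a] == modal_label[step_b] else 'between'
        totals[kind] += 1
        hits[kind] += int(correct)
    # Categorical perception predicts accuracy['between'] > accuracy['within'].
    return {kind: hits[kind] / totals[kind] for kind in totals}
```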
In addition, we adopted a task (simultaneous perceptual matching) which eliminated the memory requirements inherent in the ABX discrimination task used in studies based on methods from the speech perception literature. Enhanced discriminability was still found for stimuli falling across category boundaries.

Although the results of Experiments 1–3 were inconsistent with a two-dimensional account of the perceptual classification of facial expressions of emotion, there were also aspects of the data which indicated that subjects could none the less often see differences between stimuli they assigned to a common perceptual category; reaction times for identification were affected by the closeness of a morphed image to the nearest prototype, and within-category discriminabilities were not always at chance level. In Experiment 4, we therefore examined systematically the extent to which subjects could recognise exactly which prototypes particular morphed images had been interpolated between; although their judgements were not perfect, they did have some insight.

These findings provide data which must be explained by any adequate model of facial expression recognition, and which allow us to reject two-dimensional accounts as inadequate models of the perceptual mechanisms involved. An account treating facial expressions as corresponding to discrete categories of emotion is more readily reconciled with our findings, but they are also inconsistent with the strongest possible variants of the categorical perception hypothesis. In our General Discussion, we examine which forms of category model can provide plausible accounts of the data we present.

Results

The issue of interest was whether subjects could see that a morphed image which was close to prototype A (e.g., one which is 90% happy) had been combined with prototype B (e.g., 10% anger). To investigate this, we created an arbitrary scale to indicate which emotions were seen in each morphed image. On this scale, the first-choice emotion for each stimulus was given a score of 3, the second choice a score of 2, the third choice a score of 1, and the other (unchosen) emotions a score of 0. What we were looking for was whether people could see the combinations of expressions within the region consistently assigned to a particular emotion category; i.e., whether a morph lying close to prototype A was rated most like expression A, but next-most like B (where B is the expression it was blended with), and unlike C, D, E or F (expressions not used in that particular morph). The critical question then becomes that of whether the mean score for emotion B was higher than the mean scores for emotions C–F.

A complicating factor is that some of the prototype facial expressions are more similar than others; hence, it is important to separate out cases where emotion B is seen in a morph simply because the prototype expression for emotion A resembles emotion B anyway. For this reason, the prototype images were included in Experiment 4, to give an estimate of the intrinsic similarities between JJ's facial expressions regardless of morphing. The scores for the relevant prototype were then subtracted from the scores for each of the morphed images, so that only differences between the perception of each emotion in a morphed image and the appropriate prototype were considered. These difference scores are presented in Fig. 7. Each of the graphs in Fig. 7 shows difference scores (y-axis) for the perception of each emotion in 90%, 70% and 50% morphs (x-axis). Each of the near-prototype expressions (i.e., the ones contributing 90%, 70% or 50% to each morph) is shown as a separate row. Moving from left to right along each row, the graphs show data for morphs moving toward each of the 6 different far-prototype expressions (i.e., those contributing 10% to 90% morphs, 30% to 70% morphs, or 50% to 50% morphs).

Fig. 7. Scores representing differences between the perception of each emotion in a morphed image and the appropriate prototype.

Because these scores represent differences from the relevant near prototype, the score for the perception of the near-prototype emotion quickly becomes negative as the morph moves away from the prototype. What is of interest, however, is what happens to the difference scores for the far-prototype emotions. Inspection of Fig. 7 shows that these are generally indistinguishable from the scores for other emotions at the 90% morph level (where the far prototype contributes 10%), start to rise above the scores for other emotions at the 70% morph level (where the far prototype contributes 30%), and are usually well clear of the rest at the 50% point.

To analyse this statistically, we compared the difference scores for far prototypes to the mean difference score for the 4 unrelated emotions which were not involved in that continuum (i.e., excluding the near and far prototypes). A two-factor analysis of variance was used to determine the effects of distance from the prototype (90%, 70%, 50% morphs; repeated measure) and type of emotion (far prototype versus unrelated; repeated measure). This showed significant main effects of both factors (distance from prototype, F(2,58)=104.77, p<0.001; type of emotion, F(1,29)=87.63, p<0.001). Both of these main effects were qualified by the significant interaction shown in Fig. 8 (distance from prototype × type of emotion, F(2,58)=111.64, p<0.001). Analysis of this interaction with post hoc Tukey tests (α=0.05) showed that perception of the far-prototype emotion was significantly different from the perception of unrelated emotions at the 50% level and at the 70% level, but not at the 90% level (where the far prototype contributes only 10% to each morphed image).

Fig. 8. Interaction of distance from prototype with type of emotion; the scores represent differences between the perception of an emotion in a morphed image and the appropriate prototype. Perception of the far-prototype emotion was significantly different from the perception of unrelated emotions (i.e., the 4 emotions which were not used in the creation of a particular morph) at the 50% level and at the 70% level, but not at the 90% level.

Note that in Fig. 8, where any overall similarity between prototype expressions has already been removed by subtraction, the mean score for the perception of unrelated expressions is effectively zero at all levels of morphing. This is exactly as would be expected if the emotions which do not come from one of the prototypes are not perceived in the morphed images. The finding that this holds for 50% morphs is consistent with the identification results from Experiments 1 and 2, where we noted that there were few intrusions from unrelated categories at the mid-point of each continuum.
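The scoring and subtraction described above can be summarised in a short sketch. This is our reconstruction for illustration only (the function names and data structures are ours, not the authors' analysis code): each emotion's rank among a subject's choices is converted to a 3/2/1/0 score, and the score profile the near prototype receives on its own is subtracted out.

```python
import numpy as np

EMOTIONS = ['happiness', 'surprise', 'fear', 'sadness', 'disgust', 'anger']
CHOICE_SCORES = {1: 3, 2: 2, 3: 1}  # first/second/third choice; others score 0

def emotion_scores(rankings):
    """Convert one stimulus's choice ranks into a score vector.

    rankings: dict mapping emotion name -> choice rank (1, 2 or 3);
    emotions that were not chosen are simply absent and score 0.
    """
    return np.array([CHOICE_SCORES.get(rankings.get(e), 0) for e in EMOTIONS])

def difference_scores(morph_rankings, prototype_rankings):
    """Subtract the scores the near prototype receives on its own, so
    that only changes in perceived emotion due to morphing remain."""
    return emotion_scores(morph_rankings) - emotion_scores(prototype_rankings)

# Example: a 90% happy / 10% anger morph judged mostly happy, then angry,
# compared against the happy prototype alone; only anger rises:
# difference_scores({'happiness': 1, 'anger': 2}, {'happiness': 1})
```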