A Blog Emotion Corpus for Emotional Expression Analysis in Chinese
Article code | Publication year | Length of English article |
---|---|---|
37956 | 2010 | 24 pages (PDF) |
Publisher : Elsevier - Science Direct
Journal : Computer Speech & Language, Volume 24, Issue 4, October 2010, Pages 726–749
English abstract
Weblogs are increasingly popular modes of communication, and they are frequently used as media for emotional expression in the ever-changing online world. This work uses blogs as the object and data source for Chinese emotional expression analysis. First, a textual emotional expression space model is described, and based on this model, a relatively fine-grained annotation scheme is proposed for manual annotation of an emotion corpus. At the document and paragraph levels, emotion category, emotion intensity, topic words, and topic sentences are annotated. At the sentence level, emotion category, emotion intensity, emotional keyword and phrase, degree word, negative word, conjunction, rhetoric, punctuation, objective or subjective status, and emotion polarity are annotated. Then, using this corpus, we explore the linguistic expressions that indicate emotion in Chinese and present a detailed data analysis of them, covering mixed emotions, independent emotions, emotion transfer, and the words and rhetorical devices used for emotional expression.
English introduction
Emotions play an important role in human intelligence, rational decision making, social interaction, perception, memory, learning, creativity, and more (Picard, 1997). There is plenty of evidence that emotion analysis has many valuable applications. In everyday life people express their emotions through multiple modalities: their linguistic content, speech, faces, and bodies. All of these cues can be used to convey emotional messages. Textual affect sensing is becoming increasingly important as more communication takes place through computer-mediated communication (CMC) Internet sources such as weblogs, emails, website forums, and chat rooms. In particular, the blogspace consists of millions of users who maintain online diaries containing frequently updated views and personal remarks about a range of issues (Mishne, 2005). Textual emotion analysis can also reinforce the accuracy of sensing in other modalities, such as speech or facial recognition, and improve human-computer interaction systems. Werry (1996) points out that in Internet relay chat (IRC), linguistic strategies have been adopted to replace the missing intonational and paralinguistic cues of face-to-face discourse. This finding is reflected in the use of coordination devices in Hancock and Dunham's (2001) study of computer-mediated task-based interactions. Despite the increased focus on the analysis of web content, emotion analysis of web content has been limited, with the majority of studies focusing on sentiment analysis or opinion mining. Classifying the mood of a single text is a hard task; state-of-the-art methods in text classification achieve only modest performance in this domain (Mishne, 2005). In this area, some of the hardest problems involve acquiring basic resources.
Corpora are fundamental both for developing sound conceptual analysis and for training emotion-oriented systems at different levels: to recognise user emotions, to express appropriate emotions, to anticipate how a user in one state might respond to a possible kind of reaction from the machine, and to support other emotion processing applications. In this study we propose an emotional expression space model for text and describe a relatively fine-grained annotation scheme that annotates emotion in text at three levels: document, paragraph, and sentence. At the document and paragraph levels, emotion category, emotion intensity, topic words, and topic sentences are annotated. At the sentence level, annotation includes emotion category, emotion intensity, emotional keyword/phrase, degree word, negative word, conjunction, rhetoric, punctuation, objective/subjective status, and emotion polarity. We explore all of these linguistic expressions that indicate emotion in Chinese and present a detailed data analysis of them, covering mixed emotions, independent emotions, emotion transfer, the POS (part-of-speech) of emotional keywords, multiple emotional keywords and phrases, and rhetoric for emotional expression. The annotation scheme has been employed in the manual annotation of a corpus containing 500 documents, with 4004 paragraphs, 12,742 sentences, and 324,571 Chinese words.
The remainder of this paper is organized as follows. Section 2 reviews current emotion corpora for textual emotion analysis. Section 3 describes the emotional expression space model in text. Section 4 describes the annotation scheme of this corpus. Section 5 presents the inter-annotator agreement study. Section 6 describes the data analysis of Chinese emotional expression. Section 7 presents the discussion. Section 8 concludes this study with closing remarks and future directions.
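The three-level annotation scheme described in the introduction can be pictured as a nested data structure. The following is only a minimal sketch of that idea, not the corpus's actual file format: the class names, field names, the 0.0 to 1.0 intensity scale, and the English example sentence are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SentenceAnnotation:
    """Sentence-level fields named in the annotation scheme."""
    text: str
    # Emotion category -> intensity. A sentence may carry mixed emotions.
    # The 0.0-1.0 intensity scale is an assumption of this sketch.
    emotions: Dict[str, float] = field(default_factory=dict)
    emotional_keywords: List[str] = field(default_factory=list)
    degree_words: List[str] = field(default_factory=list)
    negative_words: List[str] = field(default_factory=list)
    conjunctions: List[str] = field(default_factory=list)
    rhetoric: List[str] = field(default_factory=list)
    punctuation: List[str] = field(default_factory=list)
    subjective: bool = True          # objective/subjective flag
    polarity: Optional[str] = None   # hypothetical values, e.g. "positive"


@dataclass
class ParagraphAnnotation:
    """Paragraph-level fields plus the sentences the paragraph contains."""
    emotions: Dict[str, float] = field(default_factory=dict)
    topic_words: List[str] = field(default_factory=list)
    topic_sentences: List[str] = field(default_factory=list)
    sentences: List[SentenceAnnotation] = field(default_factory=list)


@dataclass
class DocumentAnnotation:
    """Document-level fields plus the paragraphs the document contains."""
    emotions: Dict[str, float] = field(default_factory=dict)
    topic_words: List[str] = field(default_factory=list)
    topic_sentences: List[str] = field(default_factory=list)
    paragraphs: List[ParagraphAnnotation] = field(default_factory=list)

    def sentence_count(self) -> int:
        """Total number of annotated sentences in the document."""
        return sum(len(p.sentences) for p in self.paragraphs)


# Invented example; the real corpus annotates Chinese blog text.
sentence = SentenceAnnotation(
    text="I am so happy to see you again!",
    emotions={"joy": 0.8, "love": 0.5},
    emotional_keywords=["happy"],
    degree_words=["so"],
    punctuation=["!"],
    polarity="positive",
)
doc = DocumentAnnotation(
    emotions={"joy": 0.8},
    paragraphs=[ParagraphAnnotation(emotions={"joy": 0.8}, sentences=[sentence])],
)
```

Mapping each level to its own class mirrors the hierarchy of document, paragraph, and sentence, and storing emotions as a category-to-intensity mapping leaves room for the mixed emotions the paper analyses.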
English conclusion
In this study we proposed an emotional expression space model that is hierarchical, consistent with the natural structure of a document. Based on this model, we described a relatively fine-grained annotation scheme and annotated emotion in text at three levels: document, paragraph, and sentence. At the document and paragraph levels, emotion category, emotion intensity, topic keywords, and topic sentences are annotated. At the sentence level, annotation includes emotion category, emotion intensity, emotion keyword/phrase, degree word, negative word, conjunction, rhetoric, punctuation, objective/subjective status, and emotion polarity. We also reported an inter-annotator agreement study on the annotation of emotion classes, emotion keywords/phrases, objective/subjective status, and so on. In addition, we explored the linguistic expressions that indicate emotion in Chinese and presented a detailed data analysis of them, covering mixed emotions, independent emotions, emotion transfer, the POS (part-of-speech) of emotion keywords, multiple emotion keywords and phrases, and rhetoric for emotional expression. The data show that accompanying emotions and transfer emotions are stable. The emotion of love has high independence, whereas the emotions of joy, surprise, and anger have relatively low independence. The data also show that verbs, nouns, adjectives, and adverbs are strong markers of emotion in Chinese. Through an experiment on the role of emotional keywords and phrases in textual emotional expression, we found that the emotion of a text can be determined from its emotional keywords and phrases to a degree of about 69%, while the remaining 31% or so relies on further grammatical or semantic analysis, such as negative words, degree words, and syntactic structures. The corpus currently contains 500 documents, with 4004 paragraphs, 12,742 sentences, and 324,571 Chinese words.
However, this size does not seem sufficient for large-scale textual emotion analysis, and many linguistic features are not yet reflected in it, so annotation is still ongoing. The expected size of the corpus is 1500 articles, with about 40,000 sentences. Moreover, future work will investigate generating emotion vectors for sentences by using different linguistic features. After that, we plan to explore two applications: emotion summarization and emotion question answering. Finally, we plan to establish a multimodal emotion recognition and expressivity analysis system by combining textual emotion analysis with speech and facial emotion analysis.
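The conclusion reports that emotional keywords and phrases determine the emotion of a text to a degree of about 69%, but this excerpt does not say how that figure was computed. One possible reading is an agreement rate between a keyword-only guess and the manually annotated emotion. The sketch below illustrates that reading only; the majority-vote rule, the function names, and the toy samples are assumptions, not the authors' procedure or data.

```python
from collections import Counter
from typing import List, Optional, Tuple


def predict_from_keywords(keyword_emotions: List[str]) -> Optional[str]:
    """Guess a text's emotion as the most frequent emotion among its keywords.

    Returns None when the text has no emotional keywords. Majority voting
    is an assumption of this sketch, not the paper's stated method.
    """
    if not keyword_emotions:
        return None
    return Counter(keyword_emotions).most_common(1)[0][0]


def agreement_rate(samples: List[Tuple[List[str], str]]) -> float:
    """Fraction of samples whose keyword-only guess matches the annotation.

    Each sample pairs the emotion labels of a text's keywords with the
    emotion a human annotator assigned to the whole text.
    """
    if not samples:
        return 0.0
    hits = sum(
        1 for keywords, annotated in samples
        if predict_from_keywords(keywords) == annotated
    )
    return hits / len(samples)


# Toy samples invented for illustration; they are not corpus data.
toy_samples = [
    (["joy", "joy", "love"], "joy"),   # keywords agree with the annotation
    (["surprise"], "surprise"),        # keywords agree with the annotation
    (["joy"], "anger"),                # e.g. a negative word flips the emotion
    ([], "love"),                      # no keywords, emotion is implicit
]
rate = agreement_rate(toy_samples)
```

The last two toy samples correspond to the kinds of cases the conclusion attributes to the remaining share: texts whose emotion depends on negative words, degree words, or syntactic structure rather than on keywords alone.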