A combination of similarity measures for identifying paraphrases
|Article code||Publication year||Length of English article|
|156791||2018||15 pages (PDF)|
The English article contains approximately 9,187 words.
Publisher: Elsevier - ScienceDirect
Journal: Computer Speech & Language, Volume 47, January 2018, Pages 59-73
Paraphrase identification is the task of determining whether two sentences are semantically equivalent. It is applied in many natural language tasks, such as text summarization, information retrieval, text categorization, and machine translation. In general, methods for paraphrase identification perform three steps. First, they represent sentences as vectors using bag of words or syntactic information about the words present in the sentence. Next, this representation is used to compute different similarity measures between the two sentences. In the third step, these similarities are given as input to a machine learning algorithm that classifies the sentence pair as a paraphrase or not. However, two important problems in paraphrase identification are not handled by this approach: (i) the meaning problem: two sentences can share the same meaning while being composed of different words; and (ii) the word order problem: the order of the words in a sentence may change the meaning of the text. This paper proposes a paraphrase identification system that represents each pair of sentences as a combination of different similarity measures. These measures extract lexical, syntactic and semantic components of the sentences, which are represented in a graph. The proposed method was benchmarked using the Microsoft Paraphrase Corpus, which is the publicly available standard dataset for the task. Different machine learning algorithms were applied to classify a sentence pair as a paraphrase or not. The results show that the proposed method outperforms state-of-the-art systems.
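The generic three-step pipeline described in the abstract can be illustrated with a minimal sketch. This is not the paper's graph-based method. The function names (`cosine_bow`, `jaccard`, `bigram_overlap`, `fit_centroids`, `predict`), the choice of these three similarity measures, and the nearest-centroid classifier are all assumptions made for illustration, and the example sentences are toy data rather than items from the Microsoft Paraphrase Corpus.

```python
import math
from collections import Counter


def tokens(sentence):
    # Naive whitespace tokenizer (an assumption; real systems use proper NLP tooling).
    return sentence.lower().split()


def cosine_bow(a, b):
    # Step 1 + 2: bag-of-words count vectors compared by cosine similarity.
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def jaccard(a, b):
    # Overlap of the two word sets, ignoring counts and order.
    sa, sb = set(tokens(a)), set(tokens(b))
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def bigram_overlap(a, b):
    # Jaccard overlap of adjacent word pairs; unlike the two measures above,
    # this one is sensitive to word order.
    ta, tb = tokens(a), tokens(b)
    ba, bb = set(zip(ta, ta[1:])), set(zip(tb, tb[1:]))
    return len(ba & bb) / len(ba | bb) if ba | bb else 0.0


def features(a, b):
    # A sentence pair is represented as a vector of similarity scores.
    return [cosine_bow(a, b), jaccard(a, b), bigram_overlap(a, b)]


def fit_centroids(pairs, labels):
    # Step 3 (stand-in classifier): average feature vector per class.
    sums, counts = {}, Counter()
    for (a, b), y in zip(pairs, labels):
        f = features(a, b)
        acc = sums.setdefault(y, [0.0] * len(f))
        for i, v in enumerate(f):
            acc[i] += v
        counts[y] += 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, a, b):
    # Assign the label whose centroid is closest in Euclidean distance.
    f = features(a, b)
    return min(centroids, key=lambda y: math.dist(f, centroids[y]))


# Toy training data (hypothetical): 1 = paraphrase, 0 = not a paraphrase.
train_pairs = [
    ("a man is playing a guitar", "a man is playing a guitar"),
    ("the cat sat on the mat", "stocks fell sharply today"),
]
train_labels = [1, 0]
centroids = fit_centroids(train_pairs, train_labels)

# Word order problem: same bag of words, different meaning.
print(features("the dog bit the man", "the man bit the dog"))
# Meaning problem: similar meaning, few shared words.
print(features("the car is fast", "the automobile is quick"))
```

In this sketch the two bag-of-words measures cannot tell "the dog bit the man" from "the man bit the dog", which is the word order problem, while the pair about the car and the automobile scores low on every lexical measure despite its similar meaning, which is the meaning problem. Combining measures that capture different aspects of a sentence pair, as the paper proposes, is one way to address both.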