Title

Bilingual embeddings with random walks over multilingual wordnets
Article code : 133800
Publication year : 2018
Length of the English article : 45 pages (PDF)
Source

Publisher : Elsevier - ScienceDirect

Journal : Knowledge-Based Systems, Available online 11 March 2018

Abstract

Bilingual word embeddings represent words of two languages in the same space and allow knowledge to be transferred from one language to the other without machine translation. The main approach is to train monolingual embeddings first and then map them using bilingual dictionaries. In this work, we present a novel method to learn bilingual embeddings based on multilingual knowledge bases (KBs) such as WordNet. Our method extracts bilingual information from multilingual wordnets via random walks and learns a joint embedding space in one go. We further reinforce cross-lingual equivalence by adding bilingual constraints to the loss function of the popular Skip-gram model. Our experiments on twelve cross-lingual word similarity and relatedness datasets in six language pairs covering four languages show that: 1) our method outperforms the state-of-the-art mapping method using dictionaries; 2) multilingual wordnets on their own improve over text-based systems in similarity datasets; 3) the combination of wordnet-generated information and text is key for good results. Our method can be applied to richer KBs like DBpedia or BabelNet, and can be easily extended to multilingual embeddings. All our software and resources are open source.
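
As a rough illustration of the random-walk idea mentioned in the abstract, the following sketch generates a bilingual pseudo-corpus from a toy synset graph. It is not the authors' software. The graph, the lemma lists, the `restart_prob` value, the language-prefixed token format, and the function names `random_walk_sentence` and `build_pseudo_corpus` are all assumptions made for this example. The bilingual constraints added to the Skip-gram loss are not shown.

```python
import random

# Hypothetical toy wordnet: synset id -> related synset ids (assumed data).
SYNSET_EDGES = {
    "dog.n": ["canine.n", "pet.n"],
    "canine.n": ["dog.n", "animal.n"],
    "pet.n": ["dog.n", "cat.n", "animal.n"],
    "cat.n": ["pet.n", "animal.n"],
    "animal.n": ["canine.n", "pet.n", "cat.n"],
}

# Hypothetical lexicalizations of each synset in two languages.
LEMMAS = {
    "dog.n": {"en": ["dog"], "es": ["perro"]},
    "canine.n": {"en": ["canine"], "es": ["cánido"]},
    "pet.n": {"en": ["pet"], "es": ["mascota"]},
    "cat.n": {"en": ["cat"], "es": ["gato"]},
    "animal.n": {"en": ["animal"], "es": ["animal"]},
}


def random_walk_sentence(start, rng, langs=("en", "es"),
                         restart_prob=0.15, max_len=20):
    """Walk the synset graph and emit one lemma per visited synset.

    The language of each emitted lemma is chosen at random, so words of
    both languages end up sharing contexts. Tokens carry a language
    prefix so that homographs such as "animal" stay distinct.
    """
    sentence = []
    synset = start
    while len(sentence) < max_len:
        lang = rng.choice(langs)
        lemma = rng.choice(LEMMAS[synset][lang])
        sentence.append(f"{lang}:{lemma}")
        # Assumed stopping rule: end the walk with a fixed probability.
        if rng.random() < restart_prob:
            break
        synset = rng.choice(SYNSET_EDGES[synset])
    return sentence


def build_pseudo_corpus(n_sentences, seed=0):
    """Generate n_sentences walks, each starting at a random synset."""
    rng = random.Random(seed)
    synsets = sorted(SYNSET_EDGES)
    return [random_walk_sentence(rng.choice(synsets), rng)
            for _ in range(n_sentences)]


if __name__ == "__main__":
    for sent in build_pseudo_corpus(3):
        print(" ".join(sent))
```

Each generated sentence mixes tokens from both languages, so a Skip-gram trainer run on this pseudo-corpus, possibly combined with ordinary text, would place them in one shared space.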