CIVILICA We Respect the Science
(Specialized publisher of national conferences / Publishing license number from the Ministry of Culture and Islamic Guidance: 8971)

Thematic Similarity Multiple-Choice Question Answering with Doc2Vec: A Step Toward Metaphorical Language Processing

Paper title: Thematic Similarity Multiple-Choice Question Answering with Doc2Vec: A Step Toward Metaphorical Language Processing
National paper ID: JR_ITRC-12-2_005
Published in 1399 (Solar Hijri calendar)
Authors:

Soroosh Akef - Sharif University of Technology
Mohammad Hadi Bokaei - Iran Telecommunication Research Center
Hossein Sameti - Sharif University of Technology

Abstract:
This paper reports an improvement over the previous benchmark on the task of answering thematic similarity multiple-choice questions (MCQs) about poetic verses. We trained a Doc2Vec model on a corpus of Persian poems and used the trained model to obtain vector representations of the verses. For each question, the model selected as its answer the option verse with the highest cosine similarity to the stem verse. The model answered 38% of the questions correctly, an improvement of 6% over the previous benchmark. Provided that a large-scale thematic similarity MCQ dataset is developed, the performance of a language representation model on this task could serve as a novel benchmark of the model's capacity to understand metaphorical language.
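
The answer-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the names `cosine_similarity`, `answer_mcq`, `toy_embed`, and the vectors in `TOY_VECTORS` are hypothetical, and the toy lookup table only stands in for a trained Doc2Vec model so that the example runs without external libraries.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def answer_mcq(stem: str, options: Sequence[str],
               embed: Callable[[str], Vector]) -> int:
    """Return the index of the option most similar to the stem verse."""
    stem_vec = embed(stem)
    scores = [cosine_similarity(stem_vec, embed(opt)) for opt in options]
    # Pick the option with the highest cosine similarity to the stem.
    return max(range(len(options)), key=scores.__getitem__)


# Assumption: made-up vectors standing in for the output of a trained model.
TOY_VECTORS = {
    "stem": [1.0, 0.0, 1.0],
    "option a": [0.0, 1.0, 0.0],
    "option b": [2.0, 0.1, 1.9],
    "option c": [-1.0, 0.0, -1.0],
    "option d": [1.0, 1.0, 0.0],
}


def toy_embed(text: str) -> Vector:
    """Hypothetical embedding function: a plain dictionary lookup."""
    return TOY_VECTORS[text]


chosen = answer_mcq(
    "stem", ["option a", "option b", "option c", "option d"], toy_embed
)
print(chosen)
```

In practice `embed` would wrap the trained model. With a library such as gensim, for example, it could call `infer_vector` on a tokenized verse, although the abstract does not say which implementation was used.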

Keywords:
Doc2Vec, MCQ answering, computational linguistics, poetry, figurative speech, digital humanities.

Paper page and full-text download: https://civilica.com/doc/1400057/