Persian Wordnet Construction using Supervised Learning

Publication year: 1396
Document type: Journal article
Language: English

The full text of this paper is available as a 10-page PDF file.

National scientific document ID:

JR_ITRC-9-2_005

Indexing date: 20 Esfand 1399

Abstract:

This paper presents an automated, supervised method for constructing a Persian wordnet. Initial links between Persian words and Princeton WordNet synsets are generated from a Persian corpus and a bilingual dictionary. A trained classifier then labels each link as correct or incorrect using seven features. The method thus reduces to a classification system trained on a set in which links from FarsNet, a pre-existing Persian wordnet, serve as correct instances and a set of randomly selected links serves as incorrect instances. The features proposed for the classifier are distributional and semantic. The links classified as correct are collected into the final wordnet. The method achieves state-of-the-art results for automatically derived Persian wordnets: the resulting wordnet has a precision of 91.18% and includes more than 16,000 words and 22,000 synsets.
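
The first two steps described in the abstract, generating candidate word-to-synset links through a bilingual dictionary and assembling a labeled training set from known-correct links plus randomly selected incorrect ones, can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the dictionary entries, synset identifiers, gold links, and the function names `generate_candidate_links` and `build_training_set` are all hypothetical, and drawing the negatives from the non-gold candidates is one possible reading of "randomly selected links". The seven features and the classifier itself are omitted because the abstract does not specify them.

```python
import random

# Toy, hypothetical data: none of these entries come from the paper.
BILINGUAL_DICT = {
    "ketab": ["book"],
    "bank": ["bank"],
}
# Hypothetical synset identifiers mapped to their English lemmas.
EN_SYNSETS = {
    "book.n.01": ["book", "volume"],
    "book.v.01": ["book", "reserve"],
    "bank.n.01": ["bank", "depository"],
    "bank.n.02": ["bank", "riverbank"],
}
# Stand-in for links already known to be correct (the role FarsNet plays).
GOLD_LINKS = {("ketab", "book.n.01"), ("bank", "bank.n.01")}


def generate_candidate_links(bilingual_dict, en_synsets):
    """Link each Persian word to every synset containing one of its translations."""
    links = set()
    for fa_word, translations in bilingual_dict.items():
        for synset_id, lemmas in en_synsets.items():
            if any(t in lemmas for t in translations):
                links.add((fa_word, synset_id))
    return links


def build_training_set(candidates, gold_links, n_negative, seed=0):
    """Label gold links 1 and a random sample of non-gold candidates 0."""
    positives = sorted(candidates & gold_links)
    pool = sorted(candidates - gold_links)
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    negatives = rng.sample(pool, min(n_negative, len(pool)))
    return [(link, 1) for link in positives] + [(link, 0) for link in negatives]


candidates = generate_candidate_links(BILINGUAL_DICT, EN_SYNSETS)
training = build_training_set(candidates, GOLD_LINKS, n_negative=2)
for link, label in training:
    print(link, label)
```

In a full pipeline, each labeled link would be turned into a feature vector and passed to a classifier, and the links it predicts as correct would form the final wordnet.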

Keywords: