CIVILICA We Respect the Science
(Specialized publisher of national conference proceedings / Publishing license no. 8971 from the Ministry of Culture and Islamic Guidance)

Sentiment Classification in Persian: Introducing a Mutual Information-based Method for Feature Selection

Paper title: Sentiment Classification in Persian: Introducing a Mutual Information-based Method for Feature Selection
National paper ID: ICEE21_377
Published in: 21st Iranian Conference on Electrical Engineering, 1392 (Solar Hijri)
Authors:

Ayoub Bagheri - Isfahan University of Technology, Isfahan, Iran
Mohamad Saraee - University of Salford, Manchester, UK
Franciska de Jong - University of Twente, Human Media Interaction

Abstract:
With the enormous growth of online reviews on the Internet, sentiment analysis has received increasing attention in the information retrieval and natural language processing communities. So far, very few studies have been conducted on sentiment analysis for Persian documents. This paper considers the problem of sentiment classification for online customer reviews in the Persian language. One challenge of Persian is its wide variety of declensional suffixes. Another common problem in Persian text is word spacing: in addition to white space as the inter-word space, Persian uses an intra-word space, called the pseudo-space, to separate the parts of a word. A further notable challenge in Persian customer reviews is the frequent use of informal or colloquial words. In this paper we address these challenges by proposing a model for sentiment classification of Persian review documents. The proposed model is based on a lemmatization approach for Persian and employs the Naive Bayes learning algorithm for classification. Additionally, we present a new feature selection method, based on mutual information, to extract the best feature set from the initially extracted features. Finally, we evaluate the performance of the model on a manually gathered collection of cellphone reviews; the results show the effectiveness of the proposed model.
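The abstract does not spell out the proposed feature selection method, so the following is only a minimal sketch of the general pipeline it describes: score terms by mutual information, keep the top-ranked ones, and classify with Naive Bayes. It uses the standard textbook mutual information between term presence and class membership, not the authors' modified variant. The function names (`mutual_information`, `select_features`, `train_nb`, `predict_nb`), the Laplace smoothing, and the toy English tokens are all assumptions made for illustration; real input would be lemmatized Persian tokens.

```python
import math
from collections import Counter


def mutual_information(docs, labels, term, cls):
    """Standard MI between 'term occurs in doc' and 'doc belongs to cls'.

    NOTE: textbook definition, assumed here; not the paper's own variant.
    """
    n = len(docs)
    counts = Counter()  # keys: (term_present, doc_in_class)
    for tokens, label in zip(docs, labels):
        counts[(term in tokens, label == cls)] += 1
    mi = 0.0
    for t in (True, False):
        for c in (True, False):
            n_tc = counts[(t, c)]
            if n_tc == 0:
                continue  # the 0 * log(0) term is taken as 0
            n_t = counts[(t, True)] + counts[(t, False)]
            n_c = counts[(True, c)] + counts[(False, c)]
            mi += (n_tc / n) * math.log2(n * n_tc / (n_t * n_c))
    return mi


def select_features(docs, labels, k):
    """Keep the k terms with the highest MI over all classes."""
    vocab = sorted({tok for tokens in docs for tok in tokens})
    classes = sorted(set(labels))
    scores = {
        t: max(mutual_information(docs, labels, t, c) for c in classes)
        for t in vocab
    }
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]


def train_nb(docs, labels, features):
    """Multinomial Naive Bayes with Laplace smoothing (an assumption)."""
    feats = set(features)
    classes = sorted(set(labels))
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    term_counts = {c: Counter() for c in classes}
    for tokens, label in zip(docs, labels):
        term_counts[label].update(t for t in tokens if t in feats)
    loglik = {}
    for c in classes:
        total = sum(term_counts[c].values()) + len(feats)
        loglik[c] = {
            t: math.log((term_counts[c][t] + 1) / total) for t in feats
        }
    return prior, loglik


def predict_nb(model, tokens):
    """Return the class with the highest log-posterior score."""
    prior, loglik = model

    def score(c):
        return prior[c] + sum(loglik[c][t] for t in tokens if t in loglik[c])

    return max(sorted(prior), key=score)


# Hypothetical toy reviews, already tokenized (placeholders, not real data).
docs = [
    ["battery", "great", "screen"],
    ["camera", "great", "price"],
    ["battery", "poor", "screen"],
    ["camera", "poor", "price"],
]
labels = ["pos", "pos", "neg", "neg"]

features = select_features(docs, labels, k=2)
model = train_nb(docs, labels, features)
```

In this toy collection the sentiment words separate the classes perfectly while the product-aspect words occur equally in both, so ranking by mutual information keeps the former and discards the latter before the classifier is trained.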

Keywords:
Sentiment classification, Sentiment analysis, Persian language, Feature Selection, Mutual information

Paper page and full-text download: https://civilica.com/doc/208434/