CIVILICA We Respect the Science
(Specialized publisher of national conference proceedings / Publishing license number from the Ministry of Culture and Islamic Guidance: 8971)

Customizing Feature Decision Fusion Model using Information Gain, Chi-Square and Ordered Weighted Averaging for Text Classification

Paper title: Customizing Feature Decision Fusion Model using Information Gain, Chi-Square and Ordered Weighted Averaging for Text Classification
National paper ID: JR_ITRC-3-2_005
Published in 1390 (Solar Hijri calendar)
Authors:

Mohammad Ali Ghaderi
Behzad Moshiri
Nasser Yazdani
Maryam Tayefeh Mahmoudi

Abstract:
Automatic classification of text data has been an important research topic in recent decades. This research introduces a new model, based on data fusion techniques, for improving text classification effectiveness. The model has two major components, feature fusion and decision fusion, and is therefore called the Feature Decision Fusion (FDF) model. The feature fusion component uses two well-known text feature selection algorithms, Chi-Square (χ²) and Information Gain (IG), and applies the Ordered Weighted Averaging (OWA) operator to their outputs to improve feature selection. The second component, decision fusion, uses the Majority Voting (MV) algorithm to combine two kinds of results: those obtained with feature fusion and those obtained without it. To evaluate the proposed model, the K-Nearest Neighbor (KNN), Decision Tree and Perceptron Neural Network algorithms were used to classify documents from the Reuters-21578 dataset. Experiments showed that the model can improve text classification effectiveness according to both the micro-averaged F1 and macro-averaged F1 measures.
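The two fusion steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the OWA weights, the score normalization, or the exact set of voters, so the min-max scaling, the equal default weights, the `top_k` cutoff, and all function names below are assumptions made for illustration only.

```python
from collections import Counter


def min_max_normalize(scores):
    """Scale a {term: score} dict to [0, 1] so the two criteria are comparable.
    (Assumption: the abstract does not say how scores are normalized.)"""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {term: 0.0 for term in scores}
    return {term: (value - lo) / (hi - lo) for term, value in scores.items()}


def owa(values, weights):
    """Ordered Weighted Averaging: sort the values in descending order, then
    take the weighted sum. Weights attach to positions, not to sources."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("OWA weights must sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def fuse_feature_scores(chi2_scores, ig_scores, weights=(0.5, 0.5), top_k=2):
    """Feature fusion sketch: combine per-term chi-square and information gain
    scores with OWA and keep the top_k terms. Both dicts are assumed to cover
    the same terms; the weights and top_k are hypothetical defaults."""
    chi2_n = min_max_normalize(chi2_scores)
    ig_n = min_max_normalize(ig_scores)
    fused = {term: owa([chi2_n[term], ig_n[term]], weights) for term in chi2_n}
    # Sort by fused score (descending), breaking ties alphabetically.
    ranked = sorted(fused, key=lambda term: (-fused[term], term))
    return ranked[:top_k]


def majority_vote(predictions):
    """Decision fusion sketch: return the most frequent label among the
    classifier outputs. Ties go to the label that appears first."""
    return Counter(predictions).most_common(1)[0][0]


# Toy scores, invented purely for illustration.
chi2_scores = {"oil": 10.0, "bank": 4.0, "the": 0.0}
ig_scores = {"oil": 0.3, "bank": 0.4, "the": 0.0}
selected = fuse_feature_scores(chi2_scores, ig_scores)

# Labels from classifiers trained with and without feature fusion.
label = majority_vote(["earn", "acq", "earn"])
```

With weights such as `(0.7, 0.3)` the operator leans toward whichever criterion scores a term higher, while `(0.5, 0.5)` reduces to a plain average; how the paper sets these weights is not stated in the abstract.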

Keywords:
Text classification, text categorization, document classification, document categorization, text feature selection, data fusion

Paper page and full-text download: https://civilica.com/doc/1426584/