Document Classification Using Novel Competitive Neural Text Classifier

Publication year: 1389 (Solar Hijri)
Document type: Journal article
Language: English

The full text of this article is available as a 13-page PDF.

National scientific document ID:

JR_ITRC-2-4_003

Indexing date: 23 Farvardin 1401 (Solar Hijri)

Abstract:

Text categorization is one of the well-studied problems in data mining and information retrieval. The setting is a large data set of documents in which each document is associated with its corresponding category. This research proposes a novel approach to classifying English and Persian documents that combines a competitive neural text categorizer with a new document representation that we call string vectors. Traditional approaches to text categorization require encoding documents into numerical vectors, which leads to two main problems: huge dimensionality and sparse distribution. Although many feature selection methods have been developed to address the first problem, the reduced dimension still remains large. If the dimension is reduced excessively by a feature selection method, the robustness of document categorization degrades. The idea of this research, as a solution to these problems, is to encode documents into string vectors and feed those string vectors to the novel competitive neural text categorizer. Extensive experiments on several benchmarks were conducted. The results indicate that this method can significantly improve document classification performance, by up to 13.8% in comparison with the best traditional algorithm on the standard Reuters-21578 dataset.
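The contrast between the two encodings can be illustrated with a minimal sketch. The abstract does not define how a string vector is built, so the version here (the few most frequent words of a document, kept as an ordered list of strings) is only an assumption for illustration, as are the toy corpus, the naive whitespace tokenizer, and all function names. The competitive neural text categorizer itself is not reproduced.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately naive tokenizer."""
    return text.lower().split()


def numerical_vector(tokens, vocabulary):
    """Bag-of-words counts: one dimension per vocabulary word."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def string_vector(tokens, size):
    """ASSUMED encoding, not taken from the paper: the `size` most
    frequent words of the document, ties broken alphabetically."""
    counts = Counter(tokens)
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return ranked[:size]


def sparsity(vector):
    """Fraction of zero entries in a numerical vector."""
    return sum(1 for v in vector if v == 0) / len(vector)


# Hypothetical toy corpus, not drawn from any benchmark.
corpus = [
    "oil prices rise as oil supply falls",
    "wheat exports rise after record wheat harvest",
    "bank raises interest rates as inflation grows",
]
tokenized = [tokenize(doc) for doc in corpus]
vocabulary = sorted({word for tokens in tokenized for word in tokens})

numeric = [numerical_vector(tokens, vocabulary) for tokens in tokenized]
strings = [string_vector(tokens, 3) for tokens in tokenized]

for num, strv in zip(numeric, strings):
    # Vector length, share of zero entries, and the string vector.
    print(len(num), round(sparsity(num), 2), strv)
```

Even in this tiny corpus, every numerical vector has one dimension per vocabulary word and most of its entries are zero, while the assumed string vector keeps a fixed, small number of entries regardless of vocabulary size. On a real collection the vocabulary grows to tens of thousands of words, which is the dimensionality and sparsity problem the abstract describes.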

Keywords:

Data mining, text categorization, vector-based model, competitive neural text categorizer

Authors

Seyyed Mohammad Reza Farshchi

Artificial Intelligence Department and Advance Research Center (ARC), Islamic Azad University, Mashhad Branch, Iran

Mohammad Bagher Naghibi Sistani

Electrical Engineering Department, Ferdowsi University of Mashhad, Mashhad, Iran