CIVILICA We Respect the Science
(Specialized publisher of national conferences / Publishing license number from the Ministry of Culture and Islamic Guidance: 8971)

Multi-label text categorization using Error-Correcting Output Coding with weighted probability

Paper title: Multi-label text categorization using Error-Correcting Output Coding with weighted probability
National paper ID: JR_IJE-35-8_008
Published in: 1401 (Solar Hijri calendar)
Authors:

BALAMURUGAN VELAN - Department of ECE, Sathyabama Institute of Science and Technology, Old Mamallapuram Road
Vedanarayanan V - Department of ECE, Sathyabama Institute of Science and Technology, Old Mamallapuram Road
Sahaya Anselin Nisha A - Department of ECE, Sathyabama Institute of Science and Technology, Old Mamallapuram Road
Narmadha R - Department of ECE, Sathyabama Institute of Science and Technology, Old Mamallapuram Road
AMIRTHALAKSHMI. T.M - Department of Electronics and Communication Engineering, SRM Institute of Technology, Ramapuram, Chennai, India

Abstract:
In many real-world categorization problems, labeled data is hard to acquire even though a huge amount of unlabeled data is available. Hence, it is important to devise novel approaches to these problems that choose the most valuable instances for labeling and thereby build a superior classifier. Most existing techniques are devised for binary categorization, and only a limited number of algorithms are designed to handle multi-label cases. Multi-label classification is more complex because a sample may belong to multiple labels from the set of available classes. Text data on the World Wide Web is an obvious example of such a task. This paper develops a novel technique for multi-label text categorization by modifying the Error-Correcting Output Coding (ECOC) approach. Here, a cluster of complementary binary classifiers is employed to make ECOC more effective for multi-class problems. In addition, a weighted posterior probability is computed to enhance multi-label text classification performance. The performance of the proposed ECOC with weighted probability is analyzed using precision, recall, and F-measure, with maximum values of 0.897 for precision, 0.896 for recall, and 0.895 for F-measure.
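
The abstract does not spell out how the weighted posterior probability is computed, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes each binary classifier reports a posterior probability that its code bit is 1, that each classifier carries a reliability weight (for example, a validation accuracy), and that a label is assigned whenever the weighted agreement with its codeword reaches a threshold. The names `CODE_MATRIX`, `weighted_ecoc_scores`, `predict_labels`, and all numeric values are hypothetical.

```python
from typing import List, Sequence

# Hypothetical ECOC code matrix: one row (codeword) per class,
# one column per binary classifier.
CODE_MATRIX: List[List[int]] = [
    [1, 0, 1, 0],  # class 0
    [0, 1, 1, 0],  # class 1
    [0, 0, 0, 1],  # class 2
]


def weighted_ecoc_scores(
    code_matrix: Sequence[Sequence[int]],
    bit_probs: Sequence[float],
    weights: Sequence[float],
) -> List[float]:
    """Score each class by weighted agreement with its codeword.

    bit_probs[j] is the assumed posterior P(bit j = 1 | document).
    weights[j] is an assumed reliability weight for classifier j.
    """
    total = sum(weights)
    scores = []
    for codeword in code_matrix:
        # A classifier agrees with a codeword bit of 1 with probability p,
        # and with a codeword bit of 0 with probability 1 - p.
        agreement = sum(
            w * (p if bit == 1 else 1.0 - p)
            for bit, p, w in zip(codeword, bit_probs, weights)
        )
        scores.append(agreement / total)
    return scores


def predict_labels(
    code_matrix: Sequence[Sequence[int]],
    bit_probs: Sequence[float],
    weights: Sequence[float],
    threshold: float = 0.6,
) -> List[int]:
    """Return every class whose weighted score reaches the threshold."""
    scores = weighted_ecoc_scores(code_matrix, bit_probs, weights)
    return [label for label, score in enumerate(scores) if score >= threshold]


# Illustrative inputs only; they do not come from the paper.
bit_probs = [0.9, 0.2, 0.8, 0.1]
weights = [0.9, 0.7, 0.8, 0.6]
labels = predict_labels(CODE_MATRIX, bit_probs, weights, threshold=0.45)
```

Returning every class above the threshold, rather than only the single best-scoring class, is what makes this decoding multi-label. Lowering the threshold admits more labels per document, which would typically trade precision for recall.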

Keywords:
Text Categorization, Multi-Label Classification, Multi-label Text categorization, Error correcting output coding, Posterior Probability

Paper page and full-text download: https://civilica.com/doc/1437704/