CIVILICA We Respect the Science
(Specialized publisher of the country's conferences / Publishing license number from the Ministry of Culture and Islamic Guidance: 8971)

Improving the Classification of Unknown Documents by Concept Graph

Paper title: Improving the Classification of Unknown Documents by Concept Graph
National paper ID: CSICC14_095
Published in the 14th Annual International Conference of the Computer Society of Iran, 1388 (Solar Hijri calendar)
Author details:

Morteza Mohaqeqi - ECE Department, University of Tehran, Tehran, Iran
Reza Soltanpoor - Computer Department, Islamic Azad University of Tehran North branch, Tehran, Iran
Azadeh Shakery - ECE Department, University of Tehran, Tehran, Iran

Abstract:
A concept graph is a graph that represents the relationships between the concepts of a language. In this structure, the relationship between any two words is represented by a weighted edge, and the weight is interpreted as the degree of relevance between the two words. Given this graph, we can obtain the words most relevant to a particular term. In this paper, we propose a method for improving the classification of documents from unknown sources by means of a concept graph. In our method, features are first selected from a training set by a well-known feature selection algorithm. Then, by extracting the most relevant words for each class from the concept graph, a more effective feature set is produced. Our experimental results show improvements of 1% in precision and 8% in recall.
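
The two-step idea in the abstract (select features, then enlarge the feature set with related words from the concept graph) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the graph is a plain dictionary of weighted neighbors, the names `most_relevant` and `expand_features` are hypothetical, the expansion is done per selected term with a fixed cutoff `k`, and the toy words and weights are invented.

```python
from typing import Dict, List

# Assumed representation: word -> {neighbor word: relevance weight}.
ConceptGraph = Dict[str, Dict[str, float]]


def most_relevant(graph: ConceptGraph, term: str, k: int) -> List[str]:
    """Return up to k neighbors of `term`, highest edge weight first."""
    neighbors = graph.get(term, {})
    # Sort by descending weight; break ties alphabetically for determinism.
    ranked = sorted(neighbors.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


def expand_features(
    selected: Dict[str, List[str]], graph: ConceptGraph, k: int
) -> Dict[str, List[str]]:
    """Add the k most relevant graph neighbors of each selected term.

    `selected` maps a class label to the terms chosen for that class by
    some feature selection step (not implemented here).
    """
    expanded: Dict[str, List[str]] = {}
    for label, terms in selected.items():
        features = list(terms)
        seen = set(terms)
        for term in terms:
            for word in most_relevant(graph, term, k):
                if word not in seen:  # avoid duplicate features
                    seen.add(word)
                    features.append(word)
        expanded[label] = features
    return expanded


# Toy data, invented for illustration only.
graph: ConceptGraph = {
    "football": {"goal": 0.9, "league": 0.7, "weather": 0.1},
    "election": {"vote": 0.8, "party": 0.6},
}
selected = {"sports": ["football"], "politics": ["election"]}
expanded = expand_features(selected, graph, k=2)
```

In this sketch the low-weight neighbor "weather" is left out of the "sports" features because only the two strongest edges are followed; how the paper chooses the number of related words per class is not stated in the abstract.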

Paper page and full-text download: https://civilica.com/doc/73060/