Optimal Feature Selection for Data Classification and Clustering: Techniques and Guidelines

Publication year: 1395
Document type: Conference paper
Language: English

The file for this paper is available as 24 pages in PDF and WORD format

National scientific document ID:

CITCOMP01_030

Indexing date: 16 Shahrivar 1395

Abstract:

Feature selection poses serious challenges to experts in the field, and this article investigates how they cope with them. After explaining the dimension reduction techniques related to feature extraction and feature selection, it explores the concepts, principles, and existing feature selection methods for classifying and clustering data. To that end, it discusses categorizing frameworks for finding selected subsets, namely search-based and non-search-based procedures, together with evaluation criteria and data mining tasks. When classifying feature selection algorithms, experts usually rely on a proposed categorizing framework that gives them guidelines for choosing the appropriate algorithm(s) for each application. Accordingly, it is suggested that similar algorithms that share the same evaluation criteria and follow the same process for finding a subset be placed in the same block. Empty blocks, which indicate that no algorithm has yet been designed for them, motivate the search for new algorithms. This, in turn, allows the design principles of an intelligent feature selection system to be established consistently with the proposed categorization. In the next stage, a platform is developed as an intermediate step toward such a system; it incorporates the crucial and decisive factors in the feature selection process. This procedure enables experts to select the appropriate algorithm(s) for a multi-purpose application and to partially fulfill its demands, and it increases classification accuracy and the goodness of clusters. Finally, employing several algorithms, the authors examine a simple meta-algorithm in order to design an intelligent, integrated feature selection system that uses individual algorithms. Some of the problems and challenges facing current and future feature selection are also discussed.
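
The block-based categorization described in the abstract can be pictured as a small grid keyed by procedure, evaluation criterion, and data mining task. The following is a minimal sketch under stated assumptions, not the authors' framework: the axis values (`filter`, `wrapper`, and so on), the function names `build_grid` and `empty_blocks`, and the example placements are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import product

# Axes named in the abstract; the specific values are illustrative assumptions.
PROCEDURES = ("search-based", "non-search-based")
CRITERIA = ("filter", "wrapper")
TASKS = ("classification", "clustering")
ALL_BLOCKS = list(product(PROCEDURES, CRITERIA, TASKS))


def build_grid(placements):
    """Group algorithm names into blocks keyed by (procedure, criterion, task)."""
    grid = defaultdict(list)
    for name, procedure, criterion, task in placements:
        key = (procedure, criterion, task)
        if key not in ALL_BLOCKS:
            raise ValueError(f"unknown block for {name}: {key}")
        grid[key].append(name)
    return dict(grid)


def empty_blocks(grid):
    """Return every block that holds no algorithm, i.e. a candidate gap."""
    return [key for key in ALL_BLOCKS if key not in grid]


# Hypothetical placements, for illustration only.
PLACEMENTS = [
    ("information-gain ranking", "non-search-based", "filter", "classification"),
    ("chi-squared ranking", "non-search-based", "filter", "classification"),
    ("sequential forward selection", "search-based", "wrapper", "classification"),
]

grid = build_grid(PLACEMENTS)
gaps = empty_blocks(grid)

for block, names in grid.items():
    print(block, "->", names)
print("empty blocks:", len(gaps))
```

In this sketch, algorithms that share all three coordinates land in the same block, and the blocks returned by `empty_blocks` correspond to combinations for which no algorithm has been listed.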

Authors

Ali Asghar Nadri

Department of Computer Engineering, College of Engineering, Yasooj Branch, Islamic Azad University, Yasooj, Iran

Farhad Rad

Department of Computer Engineering, College of Engineering, Yasooj Branch, Islamic Azad University, Yasooj, Iran

Hamid Parvin

Department of Computer Engineering, College of Engineering, Yasooj Branch, Islamic Azad University, Yasooj, Iran