CIVILICA We Respect the Science
(Specialized publisher of national conference proceedings / Publishing license no. 8971 from the Ministry of Culture and Islamic Guidance)

PSO Algorithm for Text Clustering Based on Latent Semantic Indexing

Paper title: PSO Algorithm for Text Clustering Based on Latent Semantic Indexing
National paper ID: IDMC04_093
Published in the 4th Iran Data Mining Conference, 1389 (Solar Hijri calendar)
Authors:

Eisa Hasanzadeh - Qazvin Islamic Azad University, Electrical & computer engineering faculty, Qazvin, Iran
Hamid Hasanpour - Faculty of Computer & IT Engineering, Shahrood University of Technology, Iran

Abstract:
In this paper we develop a particle swarm optimization (PSO) algorithm based on latent semantic indexing (PSO+LSI) for text clustering. The main problem for text clustering algorithms is very high dimensionality, because in the vector space model (VSM) each term represents one dimension. Latent semantic indexing (LSI) is a technique that can reduce the dimensionality of textual data. The PSO family of bio-inspired algorithms has recently been applied successfully to a number of real-world clustering problems. We use an adaptive inertia weight (AIW) that balances exploration and exploitation of the search space. PSO can be combined with LSI to achieve better clustering accuracy and efficiency. The superiority of PSO+LSI over the PSO+K-means clustering algorithm is demonstrated on two datasets (Hamshahri and Reuters).
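The abstract does not give the fitness function or the AIW formula, so the following is only a minimal sketch of PSO-based clustering under stated assumptions, not the authors' implementation. It assumes the documents have already been projected to a low-dimensional space (for example by LSI), uses the mean distance from each document to its nearest centroid as a hypothetical fitness, and substitutes a linearly decreasing inertia weight for the paper's AIW. The function name `pso_cluster` and all parameter values are illustrative.

```python
import math
import random


def fitness(centroids, docs):
    """Mean distance from each document to its nearest centroid (lower is better)."""
    return sum(min(math.dist(d, c) for c in centroids) for d in docs) / len(docs)


def pso_cluster(docs, k, n_particles=20, n_iter=50,
                w_max=0.9, w_min=0.4, c1=1.5, c2=1.5, seed=0):
    """Cluster reduced document vectors with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(docs[0])
    # Each particle encodes k centroids, initialised from randomly chosen documents.
    pos = [[list(d) for d in rng.sample(docs, k)] for _ in range(n_particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    pbest = [[c[:] for c in p] for p in pos]
    pbest_fit = [fitness(p, docs) for p in pos]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = [c[:] for c in pbest[g]], pbest_fit[g]
    history = [gbest_fit]

    for t in range(n_iter):
        # Assumption: a linearly decreasing inertia weight stands in for the
        # paper's adaptive inertia weight, whose formula is not in the abstract.
        w = w_max - (w_max - w_min) * t / max(1, n_iter - 1)
        for i in range(n_particles):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    vel[i][j][d] = (w * vel[i][j][d]
                                    + c1 * r1 * (pbest[i][j][d] - pos[i][j][d])
                                    + c2 * r2 * (gbest[j][d] - pos[i][j][d]))
                    pos[i][j][d] += vel[i][j][d]
            f = fitness(pos[i], docs)
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [c[:] for c in pos[i]], f
                if f < gbest_fit:
                    gbest, gbest_fit = [c[:] for c in pos[i]], f
        history.append(gbest_fit)

    # Assign every document to the nearest centroid of the best particle found.
    labels = [min(range(k), key=lambda j: math.dist(doc, gbest[j])) for doc in docs]
    return gbest, labels, history
```

Calling `pso_cluster(vectors, k=2)` on a list of equal-length numeric tuples returns the best centroids, one cluster label per document, and the best fitness after each iteration. A large inertia weight early on favours exploration and a small one later favours exploitation, which is the trade-off the abstract attributes to AIW.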

Keywords:
Vector Space Model; PSO Algorithm; Latent Semantic Indexing; Text Clustering; Adaptive Inertia Weight

Paper page and full-text download: https://civilica.com/doc/109091/