Dimension and computation reduction approach for K-Means clustering algorithm for Big Data

Year of publication: 1401
Document type: Conference paper
Language: English
Views: 240

The file of this paper is available for download as a 5-page PDF.

National scientific document ID:

DCBDP07_017

Indexing date: 7 Khordad 1401

Abstract:

This paper proposes a method to reduce the computational cost of the K-Means clustering algorithm for big data. First, the PCA algorithm reduces the datasets to one or two dimensions. Then, the distances from each point to its two nearest centers, together with their changes over the last two iterations, are used to increase the speed and quality of the K-Means algorithm. Experiments on real samples showed that, in the best case, the proposed method improved speed by 95.91% and quality by 99.71%. These findings show that the proposed method is very useful for big data.
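
The two steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact skipping rule, so the code below swaps in a Hamerly-style bound test as an assumption: each point keeps an upper bound on the distance to its assigned center and a lower bound on the distance to its second-nearest center, and the point is re-examined only when center movement could have changed its assignment. The function names `pca_1d`, `two_nearest`, and `kmeans_pruned` are hypothetical, and the power-iteration projection stands in for a library PCA.

```python
import math
import random

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pca_1d(points, iters=100):
    """Project points onto their first principal direction (power iteration).
    Assumes the data is not constant and the start vector is not orthogonal
    to the principal direction."""
    n, d = len(points), len(points[0])
    mean = [sum(col) / n for col in zip(*points)]
    X = [[x - m for x, m in zip(p, mean)] for p in points]
    v = [1.0] * d
    for _ in range(iters):
        proj = [sum(a * b for a, b in zip(row, v)) for row in X]
        w = [sum(proj[i] * X[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [(sum(a * b for a, b in zip(row, v)),) for row in X]

def two_nearest(p, centers):
    """Return (nearest center index, distance to it, distance to second nearest)."""
    ds = sorted((dist(p, c), j) for j, c in enumerate(centers))
    return ds[0][1], ds[0][0], ds[1][0]

def kmeans_pruned(points, centers, max_iter=100):
    """K-Means (k >= 2) that skips points whose assignment cannot have changed."""
    centers = [tuple(c) for c in centers]
    info = [two_nearest(p, centers) for p in points]
    labels = [t[0] for t in info]
    upper = [t[1] for t in info]  # upper bound: distance to assigned center
    lower = [t[2] for t in info]  # lower bound: distance to any other center
    skipped = 0
    for _ in range(max_iter):
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center
        moved = [dist(a, b) for a, b in zip(centers, new_centers)]
        centers = new_centers
        max_move = max(moved)
        if max_move == 0.0:
            break  # converged
        # Assignment step with pruning (assumed rule, not the paper's).
        for i, p in enumerate(points):
            upper[i] += moved[labels[i]]
            lower[i] -= max_move
            if upper[i] < lower[i]:
                skipped += 1  # assigned center is still strictly nearest
                continue
            labels[i], upper[i], lower[i] = two_nearest(p, centers)
    return labels, centers, skipped

# Usage on synthetic data: two well-separated 3-D groups.
random.seed(0)
group_a = [tuple(random.uniform(-0.5, 0.5) for _ in range(3)) for _ in range(50)]
group_b = [(10 + random.uniform(-0.5, 0.5), 10 + random.uniform(-0.5, 0.5),
            random.uniform(-0.5, 0.5)) for _ in range(50)]
reduced = pca_1d(group_a + group_b)
labels, centers, skipped = kmeans_pruned(reduced, [reduced[0], reduced[50]])
print("distance computations skipped for", skipped, "point visits")
```

In this sketch the savings come from two places: distances are computed on one coordinate instead of three, and a skipped point costs two additions and a comparison instead of a distance to every center. The percentages reported in the abstract refer to the authors' own method and data, not to this illustration.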

Authors

Mahdi Yazdian-Dehkordi

Assistant Professor of Artificial Intelligence, Computer Engineering Department, Yazd University, Yazd, Iran

Fatemeh Moodi

Ph.D. student of Computer Engineering, Computer Engineering Department, Yazd University, Yazd, Iran