Risk Classification of Imbalanced Data for Car Insurance Companies: Machine Learning Approaches

Publication year: 1401 (Solar Hijri)
Document type: Journal article
Language: English

The full text of this article is available as a 10-page PDF.

National scientific document ID:

JR_IJMAC-12-3_001

Indexing date: 22 Farvardin 1402

Abstract:

This paper presents a mechanism that lets insurance companies identify the most informative features for classifying customer risk in third-party liability (TPL) car insurance. Underwriting currently relies on expert experience, and the industry lacks a systematic method for categorizing policyholders by risk level. We analyzed 13,388 observations from an insurance claim dataset of bodily injury reports provided by an Iranian insurance company. The main challenge is that the dataset is imbalanced. We therefore apply logistic regression and random forest to differently resampled versions of the original data in order to improve model performance. The results indicate that random forest combined with hybrid resampling is the best classifier, and that victim age, premium, car age, and insured age are the most important factors for claims prediction.
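
To make the idea of hybrid resampling concrete, the following is a minimal sketch in plain Python. It is not the authors' method: the abstract does not say which over- and undersampling techniques were combined, so this example assumes the simplest variant (random undersampling of the larger class plus random oversampling of the smaller class). The function name `hybrid_resample`, the parameter `target_per_class`, and the toy data are all hypothetical.

```python
import random
from collections import Counter


def hybrid_resample(rows, labels, target_per_class, seed=0):
    """Bring every class to exactly `target_per_class` rows.

    Assumption: purely random resampling. Larger classes are undersampled
    without replacement; smaller classes keep all original rows and are
    topped up with random duplicates.
    """
    rng = random.Random(seed)

    # Group rows by class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        if len(group) >= target_per_class:
            # Undersample: draw without replacement.
            picked = rng.sample(group, target_per_class)
        else:
            # Oversample: keep originals, add extras drawn with replacement.
            extra = rng.choices(group, k=target_per_class - len(group))
            picked = group + extra
        out_rows.extend(picked)
        out_labels.extend([label] * target_per_class)
    return out_rows, out_labels


# Toy data, made up for illustration: 95 no-claim rows and 5 claim rows.
rows = [(i,) for i in range(100)]
labels = [0] * 95 + [1] * 5

new_rows, new_labels = hybrid_resample(rows, labels, target_per_class=50)
print(Counter(new_labels))
```

In practice the resampling would be applied to the training split only, so that the test split keeps the original class ratio, and the balanced rows would then be passed to a classifier such as logistic regression or random forest. A dedicated library such as imbalanced-learn offers more refined resamplers than the random scheme assumed here.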

Authors

Farzan Khamesian

Insurance Research Center, Tehran, Iran

Maryam Esna-Ashari

Insurance Research Center, Tehran, Iran

Eric Dei Ofosu-Hene

Department of Accounting and Finance, Faculty of Business and Law, De Montfort University, Leicester, UK

Farbod Khanizadeh

Insurance Research Center, Tehran, Iran